Datenbank-spektrum Schwerpunkt: Mapreduce Programming Model Compilation of Query Languages into Mapreduce Efficient or Hadoop: Why Not Both? Parallel Entity Resolution with Dedoop Inkrementelle Neuberechnungen in Mapreduce Fachbeitrag towards Integrated Data Analytics: Time Series Forecasting in Dbms

نویسنده

Theo Härder

چکیده

s publiziert/indexiert in Google Scholar, Academic OneFile, DBLP, io-port.net, OCLC, Summon by Serial Solutions. Hinweise für Autoren für die Zeitschrift Datenbank Spektrum finden Sie auf www.springer.com/13222. Datenbank Spektrum (2013) 13:1–3 DOI 10.1007/s13222-013-0116-z

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Dedoop: Efficient Deduplication with Hadoop

We demonstrate a powerful and easy-to-use tool called Dedoop (Deduplication with Hadoop) for MapReduce-based entity resolution (ER) of large datasets. Dedoop supports a browser-based specification of complex ER workflows including blocking and matching steps as well as the optional use of machine learning for the automatic generation of match classifiers. Specified workflows are automatically t...

متن کامل

Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations because the rapid development in the use of information technology in general and network technology in particular, has led to the trend of many organizations to make their applications available for use via electronic platforms hos...

متن کامل

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

Hadoop MapReduce framework is an important distributed processing model for large-scale data intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and a same workload is assigned to each node. Default Hadoop d...

متن کامل

Efficient Big Data Processing in Hadoop MapReduce

This tutorial is motivated by the clear need of many organizations, companies, and researchers to deal with big data volumes efficiently. Examples include web analytics applications, scientific applications, and social networks. A popular data processing engine for big data is Hadoop MapReduce. Early versions of Hadoop MapReduce suffered from severe performance problems. Today, this is becoming...

متن کامل

A SCALLA: A Platform for Scalable One-Pass Analytics using MapReduce

Today’s one-pass analytics applications tend to be data-intensive in nature and require the ability to process high volumes of data efficiently. MapReduce is a popular programming model for processing large datasets using a cluster of machines. However, the traditional MapReduce model is not well-suited for one-pass analytics, since it is geared towards batch processing and requires the data se...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2013

نویسنده

چکیده

منابع مشابه

Dedoop: Efficient Deduplication with Hadoop

Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

Efficient Big Data Processing in Hadoop MapReduce

A SCALLA: A Platform for Scalable One-Pass Analytics using MapReduce

عنوان ژورنال:

اشتراک گذاری